Pitfalls of supervised feature selection
Similar resources
Pitfalls of supervised feature selection
Pawel Smialowski (1,2,*), Dmitrij Frishman (1,2) and Stefan Kramer (3). (1) Department of Genome Oriented Bioinformatics, Technische Universität München, Wissenschaftszentrum Weihenstephan, Am Forum 1, 85350 Freising; (2) Helmholtz Zentrum Munich, National Research Center for Environment and Health, Institute for Bioinformatics, Ingolstädter Landstraße 1, 85764 Neuherber...
Supervised Infinite Feature Selection
In this paper, we present a new feature selection method that is suitable for both unsupervised and supervised problems. We build upon the recently proposed Infinite Feature Selection (IFS) method where feature subsets of all sizes (including infinity) are considered. We extend IFS in two ways. First, we propose a supervised version of it. Second, we propose new ways of forming the feature adja...
Forward Semi-supervised Feature Selection
Traditionally, feature selection methods work directly on labeled examples. However, the availability of labeled examples cannot be taken for granted in many real-world applications, such as medical diagnosis, forensic science, and fraud detection, where labeled examples are hard to find. This practical problem calls for “semi-supervised feature selection” to choose the optimal set o...
Feature Selection Pitfalls and Music Classification
Previous work has employed an approach to the evaluation of wrapper feature selection methods that may overstate their ability to improve classification accuracy, because of a phenomenon akin to overfitting. This paper discusses this phenomenon in the context of recent work in machine learning, demonstrates that previous work in MIR has indeed exaggerated the efficacy of feature selection for m...
Infomation based supervised and semi-supervised feature selection
We merge the results of supervised and semi-supervised feature selection techniques. The method was applied to the five datasets from the NIPS feature selection competition. As a preprocessing step, we first discretize each training dataset using the EM algorithm. Then, we filter the discretized dataset based on the MI (mutual information) value of each feature with respect to the class var...
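The mutual-information filter step mentioned in the last snippet can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `mutual_information` and `rank_features_by_mi` and the toy data are assumptions made for illustration, and the features are assumed to be already discretized (the EM-based discretization is not reproduced here).

```python
from collections import Counter
from math import log2


def mutual_information(feature, labels):
    """MI in bits between two equal-length sequences of discrete values."""
    n = len(labels)
    joint = Counter(zip(feature, labels))  # counts of (x, y) pairs
    px = Counter(feature)                  # marginal counts of x
    py = Counter(labels)                   # marginal counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * log2(c * n / (px[x] * py[y]))
    return mi


def rank_features_by_mi(columns, labels):
    """Return (feature_index, MI) pairs sorted by decreasing MI."""
    scores = [(j, mutual_information(col, labels))
              for j, col in enumerate(columns)]
    return sorted(scores, key=lambda s: s[1], reverse=True)


# Toy, made-up data: one list per (already discretized) feature.
labels = [0, 0, 1, 1]
columns = [
    [0, 0, 1, 1],  # identical to the class label
    [0, 1, 0, 1],  # independent of the class label
    [0, 0, 0, 1],  # partially informative
]
ranking = rank_features_by_mi(columns, labels)
for index, score in ranking:
    print(f"feature {index}: MI = {score:.4f} bits")
```

A filter would then keep the top-ranked features or those above a chosen threshold. To avoid the overfitting-like effect described in the music classification snippet above, the scores would be computed on the training portion of the data only.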
Journal
Journal title: Bioinformatics
Year: 2009
ISSN: 1460-2059,1367-4803
DOI: 10.1093/bioinformatics/btp621